Statistical evaluation of SAGE libraries: consequences for experimental design.
Abstract
Since the introduction of serial analysis of gene expression (SAGE) as a method to quantitatively analyze the differential expression of genes, several statistical tests have been published for the pairwise comparison of SAGE libraries. Testing the difference between the number of specific tags found in two SAGE libraries is hampered by the fact that each SAGE library is only one measurement: the necessary information on biological variation or experimental precision is not available. In the currently available tests, a measure of this variance is obtained from simulation or based on the properties of the tag distribution. To help the user of SAGE to decide between these tests, five different pairwise tests have been compared by determining the critical values, that is, the lowest number of tags that, given an observed number of tags in one library, needs to be found in the other library to result in a significant P value. The five tests included in this comparison are SAGE300, the tests described by Madden et al. (Oncogene 15: 1079-1085, 1997) and by Audic and Claverie (Genome Res 7: 986-995, 1997), Fisher's Exact test, and the Z test, which is equivalent to the chi-squared test. The comparison showed that, for SAGE libraries of equal as well as different size, SAGE300, Fisher's Exact test, Z test, and the Audic and Claverie test have critical values within 1.5% of each other. This indicates that these four tests will give essentially the same results when applied to SAGE libraries. The Madden test, which can only be used for libraries of similar size, is, with 25% higher critical values, more conservative, probably because the variance measure in its test statistic is not appropriate for hypothesis testing. The consequences for the choice of SAGE library sizes are discussed.
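The notion of a critical value used in the abstract can be illustrated with a minimal sketch. It covers only two of the five tests (the Z test and Fisher's Exact test) and rests on assumptions that are not taken from the paper: a one-sided alternative (more tags in the second library), a significance level of 0.05, hypothetical library sizes of 50,000 tags, and the illustrative function names `z_test_p`, `fisher_p` and `critical_value`.

```python
import math


def z_test_p(x, n1, y, n2):
    """One-sided P value of the two-proportion Z test (pooled variance).

    x, y   : tag counts in library 1 and library 2
    n1, n2 : total number of tags in each library
    Alternative hypothesis (an assumption here): library 2 has the higher proportion.
    """
    p0 = (x + y) / (n1 + n2)                      # pooled proportion
    sd = math.sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))
    if sd == 0:                                   # no tags at all: nothing to test
        return 1.0
    z = (y / n2 - x / n1) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))      # upper tail of the standard normal


def fisher_p(x, n1, y, n2):
    """One-sided P value of Fisher's Exact test on the table [[x, n1-x], [y, n2-y]].

    Conditional on k = x + y tags in total, the count in library 2 follows a
    hypergeometric distribution; the P value is the probability of y or more.
    """
    k = x + y
    denom = math.comb(n1 + n2, k)
    upper = min(k, n2)
    tail = sum(math.comb(n2, j) * math.comb(n1, k - j) for j in range(y, upper + 1))
    return tail / denom


def critical_value(test, x, n1, n2, alpha=0.05):
    """Lowest count in library 2 that gives P < alpha, given x tags in library 1."""
    y = math.ceil(x * n2 / n1)                    # start at the count expected under H0
    while y <= n2:
        if test(x, n1, y, n2) < alpha:
            return y
        y += 1
    return None                                   # no count reaches significance


# Hypothetical example: 10 tags observed in a library of 50,000 tags.
print(critical_value(z_test_p, 10, 50_000, 50_000))
print(critical_value(fisher_p, 10, 50_000, 50_000))
```

Passing the test function as an argument keeps the search loop identical for every test, so the critical values differ only through the P value calculation, which is the quantity the comparison in the abstract is about.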
Related resources
Identification and prevention of a GC content bias in SAGE libraries.
Serial Analysis of Gene Expression (SAGE) is becoming a widely used gene expression profiling method for the study of development, cancer and other human diseases. Investigators using SAGE rely heavily on the quantitative aspect of this method for cataloging gene expression and comparing multiple SAGE libraries. We have developed additional computational and statistical tools to assess the qual...
Presenting a Framework for Supporting Life-long Learning in Iranian public libraries and Its validation
Purpose: Since public libraries are now considered lifelong learning centers, they must meet the standards and conditions required to support lifelong learning, so that they can help members of society pursue personal and professional learning more effectively. Accordingly, it is necessary to develop and provide a mechanism to support lifelong learning in public libr...
Statistical modeling of sequencing errors in SAGE libraries.
MOTIVATION Sequencing errors may bias the gene expression measurements made by Serial Analysis of Gene Expression (SAGE). They may introduce non-existent tags at low abundance and decrease the real abundance of other tags. These effects are increased in the longer tags generated in LongSAGE libraries. Current sequencing technology generates quite accurate estimates of sequencing error rates. He...
Full transcriptome analysis of rhabdomyosarcoma, normal, and fetal skeletal muscle: statistical comparison of multiple SAGE libraries.
Rhabdomyosarcoma (RMS) is the most frequent soft tissue sarcoma in children. Improved treatment strategies have increased overall survival, but the response of approximately one-third of the patients is still poor. To increase the knowledge of RMS pathogenesis, we performed the first full transcriptome analysis of RMS using serial analysis of gene expression (SAGE). With a G-test for the simult...
Mesothelin is overexpressed in the vast majority of ductal adenocarcinomas of the pancreas: identification of a new pancreatic cancer marker by serial analysis of gene expression (SAGE).
PURPOSE Effective new markers of pancreatic carcinoma are urgently needed. In a previous analysis of gene expression in pancreatic adenocarcinoma using serial analysis of gene expression (SAGE), we found that the tag for the mesothelin mRNA transcript was present in seven of eight SAGE libraries derived from pancreatic carcinomas but not in the two SAGE libraries derived from normal pancreatic ...
Journal title:
- Physiological genomics
Volume 11, issue 2
Pages -
Publication date 2002